Fusion of children's speech and 2D gestures when conversing with 3D characters

Authors

  • Jean-Claude Martin
  • Stéphanie Buisine
  • Guillaume Pitel
  • Niels Ole Bernsen
Abstract

Most existing multi-modal prototypes enabling users to combine 2D gestures and speech input are task-oriented. They help adult users solve particular information tasks, often in standard 2D Graphical User Interfaces. This paper describes the NICE Andersen system, which aims at demonstrating multi-modal conversation between humans and embodied historical and literary characters. The target users are children and teenagers aged 10–18. We discuss issues in 2D gesture recognition and interpretation, as well as temporal and semantic dimensions of input fusion, ranging from system and component design through technical evaluation and user evaluation with two different user groups. We observed that recognition and understanding of spoken deictics were quite robust and that spoken deictics were always used in multimodal input. We identify the causes of the most frequent failures of input fusion and suggest possible improvements for removing these errors. The concluding discussion summarises the knowledge provided by the NICE Andersen system on how children gesture and combine their 2D gestures with speech when conversing with a 3D character, and looks at some of the challenges facing theoretical solutions aimed at supporting unconstrained speech/2D gesture fusion. © 2006 Elsevier B.V. All rights reserved.


Similar articles

Fusion of Children's Speech and 2D Gestures when Interacting with 3D Embodied Conversational Characters

Most of the existing multimodal prototypes enabling users to combine 2D gestures and speech are task-oriented. They help adult users to solve particular information tasks often in 2D standard Graphical User Interfaces. This paper describes the NICE HCA system which aims at demonstrating multimodal conversation between humans and embodied historical and literary characters. The target users are ...


Multimodal Input Fusion in Human-computer Interaction

In this paper, we address the modality-integration issue using the example of a system that aims at enabling users to combine their speech and 2D gestures when interacting with life-like characters in an educative game context. In a preliminary, limited fashion, we investigate and present the use of combined speech input, 2D gestures and environment entities for user–system interaction.


Hand Gesture Recognition from RGB-D Data using 2D and 3D Convolutional Neural Networks: a comparative study

Despite considerable advances in recognizing hand gestures from still images, there are still many challenges in the classification of hand gestures in videos. The latter comes with more difficulties, including higher computational complexity and the arduous task of representing temporal features. Hand movement dynamics, represented by temporal features, have to be extracted by analyzing the total fr...


Mental Timeline in Persian Speakers’ Co-speech Gestures based on Lakoff and Johnson’s Conceptual Metaphor Theory

One well-known conceptual metaphor is the metaphor of "time as space": time, as an abstract concept, is conceptualized through a concrete concept such as space. This conceptualization of time is also reflected in co-speech gestures. In this research, we try to find out what dimension and direction the mental timeline has in co-speech gestures and under the influence of which one of the metaphoric...


Extracting and analysing co-speech head gestures from motion-capture data

This paper reports on a method developed for extracting and analyzing head gestures taken from motion capture data of spontaneous dialogue in Swedish. The head gestures were extracted automatically and then manually classified using a 3D player which displays time-synced audio and 3D point data of the motion capture markers together with animated characters. Prosodic features were extracted fro...



Journal:
  • Signal Processing

Volume 86, Issue 

Pages  -

Publication year: 2006